Abstract: We consider allcast and multicast flow problems
where either all of the nodes or only a subset of the nodes 
may be in session. Traffic from each node in the session has 
to be sent to every other node in the session. If the session does 
not consist of all the nodes, the remaining nodes act as relays. 
The nodes are connected by undirected links whose capacities 
are independent and identically distributed random variables. 
We study the asymptotics of the capacity region (with network 
coding) in the limit of a large number of nodes, and show 
that the normalized sum rate converges to a constant almost 
surely. We then provide a decentralized push-pull algorithm 
that asymptotically achieves this normalized sum rate without 
network coding. 

Index Terms: allcast, broadcast, Erdős-Rényi random graph,
flows, matching, multicast, network coding, random graph,
Steiner tree, tree packing

I. Introduction 

In this paper, we investigate the capacity of allcast and 
multicast sessions over random link-capacitated graphs. Two 
questions motivated us to study these problems in the context 
of random graphs. 

(1) While it is known that network coding in general
provides a large coding advantage over multicast flows in
directed graphs, Li et al. [1] showed that the coding advantage
in undirected graphs is upper bounded by 2. In some specific
topologies a tighter upper bound is known [2]. However, several
simulation experiments showed nearly no coding advantage
for some classes of random undirected graphs [3]. Is there
a provable statement that there is negligible multicast coding
advantage for a rich class of random undirected networks?

(2) If we stick to the domain of flows (with duplication), 
as we will soon see, optimal allcasting and multicasting lead 
to tree and Steiner tree packing problems respectively. While
packing of trees is known to be easy (see [4], [5], [6]),
Steiner tree packing is known to be hard [7]. Due to its
application in multicasting over wired networks and in VLSI
layout optimization, practitioners and theorists have over many
years provided hardness results, heuristics, and approximation
algorithms (see [8], [9], [10], [11], etc.). Are there "quick-but-dirty"
(terminology from [12]), decentralized, scalable, yet
near-optimal algorithms for allcasting and multicasting over a
rich class of random undirected networks?

In this paper, we provide affirmative answers to both these 
questions. We begin by making precise what we mean by 
allcast and multicast. 

Allcast: Consider a setting where there are $n$ nodes, all
of which are engaged in a conference over a wired network.
Each node has data that needs to be made entirely available
over the network to each of the other $n - 1$ nodes in a
simultaneous fashion. (To be more precise, this is a multiple
allcast problem.) The data can be split, or routed, or coded,
or transmitted in any combination thereof, so long as all
nodes eventually get the information. The underlying complete
undirected graph on $n$ vertices is capacitated: each undirected
link $e$ has capacity $C_e$ sampled independently and identically
from a distribution $F$. An allcast information flow assignment
is said to be feasible if for every link, the net (possibly coded)
flow over the link (summed over both directions) respects the
link's capacity constraint. For each feasible flow assignment,
let $r_i$ be the bit-rate of traffic sent by node $i$ to each of
the other nodes. We address the question of the set of all
achievable rate tuples $(r_1, \ldots, r_n)$ in the asymptotics of a large
number of nodes $n$. As we shall soon see, this problem is
closely related to packing of disjoint spanning trees in a
link-capacitated network with integer capacities. Minor extensions
of some previous results readily yield that the achievable rate
region is almost surely (a.s.)

$$\left\{ (r_1, r_2, \ldots) : \limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} r_i \le \frac{1}{2} E[C] \right\} \qquad (1)$$

where the expectation is of a random variable $C$ having
distribution $F$. The linear programming formulation of this
problem is given in Section II, and the proof of (1) is given
in Sections III (converse) and IV (achievability). Our proof of
achievability is via a combination of "push" and "pull" that
suggests a decentralized implementation. Section V contains
some estimates needed to establish the correctness (with high
probability) of the push-pull algorithm. Section VIII deals with
the case when the link probabilities vanish, but not too quickly.






It is known that network coding does not yield any gain in
allcast settings [1], and thus we have an asymptotic
characterization of the allcast capacity region.

Multicast: We next address a more general setting with
only a subset of $k_n$ nodes in the multicast session, where
$\lim_{n \to \infty} k_n/n = \alpha$ and $0 < \alpha < 1$. Data from each of the $k_n$
nodes has to reach every one of the other $k_n - 1$ nodes. The
remaining $n - k_n$ nodes serve as relays. This is therefore a
problem of multiple multicast among common session nodes.
Again, in a link-capacitated framework where each link capacity is
independent and identically distributed (iid) with distribution
$F$, we are interested in the set of all achievable rate tuples
$(r_1, \ldots, r_{k_n})$ in the asymptotics of a large number of nodes $n$.
We demonstrate that the capacity region is almost surely

$$\left\{ (r_1, r_2, \ldots) : \limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{k_n} r_i \le \left(1 - \frac{\alpha}{2}\right) E[C] \right\}. \qquad (2)$$

The LP formulation of this problem is in Section II, proof of
the converse is in Section III, and proof of achievability is
in Section VII. Here too, our proof of achievability is via a
decentralized push-pull algorithm. Section VI is a digression
to study single commodity flows over random networks and
develops the ingredients necessary to establish the correctness
(with high probability) of the push-pull algorithm.

Our achievability proofs are based on flows (allowing for 
duplications) and thus do not employ network coding. In 
particular, they establish that any gain from network coding in 
multicast settings is at best sublinear in the number of nodes. 
Schemes very similar to our push-pull algorithm have been
proposed and are being used over the Internet for content
distribution in peer-to-peer networks. See [13, Sec. 1-2] for
an excellent survey of such techniques. Our work proves that
a version of it is asymptotically optimal for a rich class of
random networks.

II. A Linear Programming Formulation 

A. Random graph models 

We are given a countable collection of iid random variables
$\{C_{i,j}, 1 \le i < j < \infty\}$ where each element has distribution
$F$ on $\mathbb{R}_+$. We then obtain a sequence of graphs, denoted
$\{K_n, n \ge 1\}$, where for each $n$, the graph $K_n$ is the complete
graph on the vertex set $\{1, 2, \ldots, n\}$ along with the collection
of all $\binom{n}{2}$ links. Each link $\{i,j\}$ with $1 \le i < j \le n$ has link
capacity $C_{i,j}$.

Later on, we will have a need to study Erdős-Rényi random
graphs where the link capacity distribution is Bernoulli($p$),
that is, $\Pr\{C = 1\} = p$ and $\Pr\{C = 0\} = 1 - p$. If
$C_{i,j} = 0$, then the undirected link $\{i,j\}$ has zero capacity
and is effectively absent. We then use the notation $G(n,p)$ to
denote the obtained graph for a fixed $n$.

We will also study Erdős-Rényi random graphs where $p$
depends on $n$ and vanishes with $n$. We shall denote these
$G(n,p_n)$. These may be constructed as follows. We assume
that we are now given a collection of iid random variables
$\{Z_{i,j}, 1 \le i < j < \infty\}$ where each $Z_{i,j}$ has the uniform
distribution on $[0, 1]$. The graph $G(n,p_n)$ is the graph on $n$
vertices $\{1, 2, \ldots, n\}$ where each link $\{i,j\}$ with
$1 \le i < j \le n$ has binary capacity $C_{i,j} = 1\{Z_{i,j} \le p_n\}$. The notation
$1\{\cdots\}$ stands for the indicator of an event. This construction
is of course consistent with the construction of $G(n,p)$ when
$p_n = p$ is a constant.
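
The coupled construction above can be sketched in a few lines of Python. This is only an illustration under our own naming: the helpers `sample_uniforms` and `threshold_graph` are hypothetical, and vertices are indexed from 0. One consequence that the sketch makes visible is that, for a fixed realization of the uniforms, the graph obtained with a smaller threshold is a subgraph of the one obtained with a larger threshold.

```python
import random

def sample_uniforms(n, seed=0):
    """Draw one Uniform[0, 1) variable Z[i, j] per undirected link, 0 <= i < j < n."""
    rng = random.Random(seed)
    return {(i, j): rng.random() for i in range(n) for j in range(i + 1, n)}

def threshold_graph(Z, p):
    """Binary capacities C[i, j] = 1{Z[i, j] <= p}; returns the set of present links."""
    return {link for link, z in Z.items() if z <= p}

# The same uniforms drive both graphs, so the two graphs are coupled.
Z = sample_uniforms(50, seed=1)
g_sparse = threshold_graph(Z, 0.1)
g_dense = threshold_graph(Z, 0.5)
print(len(g_sparse), len(g_dense))
```

Using a single family of uniforms for every threshold is what makes the sequence of graphs live on one probability space, which is the setting needed for almost sure statements.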

Finally, we will also study random bipartite graph
sequences $\{G(n,n,p), n \ge 1\}$ and $\{G(n,n,p_n), n \ge 1\}$. These
are constructed from the collection of iid random variables
$\{Z_{i,j}, i \ge 1, j \ge 1\}$ where once again each entry has
the uniform distribution on $[0,1]$. In the graph $G(n,n,p_n)$,
for example, there are $2n$ vertices with vertex set $V_1 \cup V_2$
where $V_1 = \{v_1, v_2, \ldots, v_n\}$ and $V_2 = \{w_1, w_2, \ldots, w_n\}$, and
the capacity on the link between node $v_i$ and node $w_j$ is
$C_{i,j} = 1\{Z_{i,j} \le p_n\}$.

B. Allcast 

Consider the allcast problem described in Section I. Li et
al. prove in [1, Cor. 4.a] that a multiple allcast rate vector
$(r_1, r_2, \ldots, r_n)$ is achievable in an undirected capacitated
network if and only if the rate vector $(\sum_{i=1}^{n} r_i, 0, \ldots, 0)$ is
achievable, i.e., the sum rate is achievable for a single allcast
with node 1 as sender and with the other $n - 1$ nodes as
receivers. This is intuitively clear since network coding does
not help for allcast, and one can make do with multicommodity
flows in multiple allcast.

We may therefore assume that there is only one sender (say
node 1), and all other $n - 1$ nodes are recipients that must
receive all information sent by node 1. The rates in such a
setting are given by $(r_1, 0, 0, \ldots)$, and we characterize $r_1$.

This maximum rate is obtained by solving the following
linear programming (LP) problem. Consider the graph $K_n$ on
$n$ vertices with associated link capacities. Let $\mathcal{T}_n$ be the set of
all spanning trees on the complete graph (ignoring capacities).
The vertices are labeled, and so Cayley's formula tells us that the
number of such trees is $n^{n-2}$. Solve the LP (Tutte [4],
Nash-Williams [5], Barahona [6], Li et al. [1]):

$$\text{Maximize} \quad \sum_{T \in \mathcal{T}_n} \lambda_T \qquad (3)$$

$$\text{subject to} \quad (a) \ \sum_{T \in \mathcal{T}_n : T \ni e} \lambda_T \le C_e \quad \text{for all } e,$$

$$\qquad\qquad\quad (b) \ \lambda_T \ge 0 \quad \text{for all } T \in \mathcal{T}_n.$$

Denote the maximum value of (3) as $\pi_n$. Then $\pi_n$ is the
maximum rate at which node 1 can allcast its information to all
the other nodes. The LP has a simple and intuitive explanation.

• If one tags an infinitesimal information element originating
at node 1 and follows the path of its spread to each
of the $n - 1$ recipients, one gets a directed graph rooted
at the source node 1 and spanning all the $n$ nodes.

• If the undirected version of this directed graph is not
a tree, i.e., there is some cycle, then some node in the
cycle is receiving this information element from two other
nodes. One of these two incoming links can be removed
without affecting the allcast property. We can thus reduce
the directed graph to a spanning arborescence, which is
a directed graph with no incoming links at the root node,
exactly one incoming link at every other node, and all
vertices covered.

• This spanning arborescence is in one-to-one correspondence
with a tree, because the root is specified as node 1. So we
may simply focus on the spanning tree associated with
the arborescence. Call this tree $T$ (which is in $\mathcal{T}_n$).

• Collect all information elements that are spread via this
tree. Call its volume $\lambda_T$.

It is clear that each $\lambda_T \ge 0$ and constraint (a) in (3) is
the capacity constraint associated with each of the links.
Consequently, the value of the optimization problem in (3)
is an upper bound on the optimal net flow from node 1.
But it is immediate that any set of $\lambda_T$ satisfying the two
constraints provides a means to achieve a rate $\sum_T \lambda_T$, since
$\lambda_T$ units of information may be directed through the spanning
arborescence associated with the tree $T$ and root vertex 1.
Thus the maximum rate of allcast flow from a single sender
is $\pi_n$, the solution to the LP in (3).

When link capacities are random, $\pi_n$ is a random variable
whose asymptotics we shall soon characterize.
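
As a small sanity check of the tree packing LP, the following Python sketch enumerates the spanning trees of a tiny complete graph by brute force and verifies that a candidate weight assignment satisfies the two constraints. The function names (`spanning_trees`, `is_feasible`) are ours and purely illustrative, vertices are indexed from 0, and no LP solver is involved. On the triangle with unit capacities, weight 1/2 on each of the three spanning trees is feasible and has value 3/2. Since every spanning tree uses $n - 1$ links, no feasible assignment can exceed the total capacity divided by $n - 1$, which is also 3/2 here, so this assignment is optimal.

```python
from fractions import Fraction
from itertools import combinations

def spanning_trees(n):
    """All spanning trees of the complete graph on vertices 0..n-1 (brute force)."""
    edges = list(combinations(range(n), 2))
    trees = []
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:          # this edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:               # n - 1 edges and no cycle: a spanning tree
            trees.append(frozenset(subset))
    return trees

def is_feasible(lam, capacity):
    """Check constraint (b), nonnegativity, and constraint (a), link capacities."""
    if any(w < 0 for w in lam.values()):
        return False
    return all(sum(w for T, w in lam.items() if e in T) <= c
               for e, c in capacity.items())

trees3 = spanning_trees(3)
capacity = {e: Fraction(1) for e in combinations(range(3), 2)}
lam = {T: Fraction(1, 2) for T in trees3}
value = sum(lam.values())
print(len(trees3), is_feasible(lam, capacity), value)
```

The enumeration also agrees with Cayley's formula for small $n$, for instance 16 trees when $n = 4$.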

C. Multicast 

For the multicast problem, without loss of generality, let us
index the session nodes as $\{1, 2, \ldots, k_n\}$. As for allcast, by [1,
Cor. 4.a], a multiple multicast rate vector $(r_1, r_2, \ldots, r_{k_n})$ with
identical session nodes is achievable in an undirected capacitated
network if and only if the rate vector $(\sum_{i=1}^{k_n} r_i, 0, \ldots, 0)$
is achievable, i.e., the sum rate is achievable for a single
multicast with node 1 as sender and with the other $k_n - 1$
nodes of the session as receivers.$^1$ We may therefore assume
that there is but one sender, namely node 1, and all other $k_n - 1$
nodes are recipients that must receive all information sent by
node 1. Denote by $\mathcal{T}_n(k_n)$ the set of all Steiner trees that
span the vertices $1, 2, \ldots, k_n$. Obviously $\mathcal{T}_n(n) = \mathcal{T}_n$. For
multicast, again as for allcast, the maximum simultaneously
transmissible rate from one sender (node 1) to the $k_n - 1$ other
recipients is the maximum value of the modified LP ([1], [14],
[10]):

$$\text{Maximize} \quad \sum_{T \in \mathcal{T}_n(k_n)} \lambda_T \qquad (4)$$

$$\text{subject to} \quad (a) \ \sum_{T \in \mathcal{T}_n(k_n) : T \ni e} \lambda_T \le C_e \quad \text{for all } e,$$

$$\qquad\qquad\quad (b) \ \lambda_T \ge 0 \quad \text{for all } T \in \mathcal{T}_n(k_n).$$

Set $\alpha_n = k_n/n$, and denote the maximum value of (4) as
$\pi_n(\alpha_n)$. The above LP is the same as that of (3) with $\mathcal{T}_n$
replaced by the less restrictive $\mathcal{T}_n(k_n)$.

Again, when link capacities are random, $\pi_n(\alpha_n)$ is a random
variable whose asymptotics we shall soon characterize.

$^1$There is some subtlety involved here since, in general, network coding
provides a coding advantage for multicasting in undirected networks; see [1,
Th. 4] for a proof of source independence in the single multicast case, which
is then generalized to get [1, Cor. 4.a].

III. An Upper Bound

Consider the following definitions.

• Let $\chi_n$ and $\chi_n(k_n)$ denote the maximum throughput
achievable in the allcast and multicast settings with the
added possibility of network coding at each node. (The
dependence of these quantities on the link capacities is
understood and suppressed.)

• Let $\eta_n$ denote the strength of the allcast network defined
as follows. Let $\mathcal{P}$ denote the set of all partitions of the
vertex set $\{1, 2, \ldots, n\}$. Consider a partition $p \in \mathcal{P}$. Let
$\partial p$ denote the set of intercomponent links. Define

$$\eta_n := \min_{p \in \mathcal{P}} \frac{\sum_{e \in \partial p} C_e}{|p| - 1} \qquad (5)$$

where $|p|$ denotes the number of subsets in the partition.

• Let $\eta_n(k_n)$ denote the strength of the multicast network
with $k_n$ nodes in the session. This is defined as follows.
Let $\mathcal{P}(k_n)$ denote the set of all partitions of the vertex
set $\{1, 2, \ldots, n\}$ such that each component of a partition
contains at least one of the session nodes $\{1, 2, \ldots, k_n\}$.
Define

$$\eta_n(k_n) := \min_{p \in \mathcal{P}(k_n)} \frac{\sum_{e \in \partial p} C_e}{|p| - 1}. \qquad (6)$$

Li et al. [1] showed the following result.
Theorem 1: (Li et al. [1, Th. 2 and Th. 3])

(a) For any allcast session, $\pi_n = \chi_n = \eta_n$.

(b) For any multicast session, $\pi_n(k_n) \le \chi_n(k_n) \le \eta_n(k_n)$. □
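
To make the notion of strength concrete, here is a brute-force Python sketch that evaluates the minimum in the definition of the allcast strength over all partitions with at least two blocks. It is meant only for very small $n$, since the number of partitions grows quickly. The helper names are ours, vertices are indexed from 0, and links missing from the `capacity` dictionary are treated as having zero capacity.

```python
from itertools import combinations

def set_partitions(items):
    """Yield every partition of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Either add `first` to one of the existing blocks ...
        for k in range(len(part)):
            yield part[:k] + [[first] + part[k]] + part[k + 1:]
        # ... or give it a block of its own.
        yield [[first]] + part

def strength(n, capacity):
    """Minimum over partitions p with |p| >= 2 of (crossing capacity) / (|p| - 1)."""
    best = None
    for part in set_partitions(list(range(n))):
        if len(part) < 2:
            continue
        block_of = {v: b for b, block in enumerate(part) for v in block}
        crossing = sum(c for (u, v), c in capacity.items()
                       if block_of[u] != block_of[v])
        ratio = crossing / (len(part) - 1)
        best = ratio if best is None else min(best, ratio)
    return best

unit_k4 = {e: 1.0 for e in combinations(range(4), 2)}
print(strength(4, unit_k4))
```

For the complete graph on 4 vertices with unit capacities, the minimum is attained by the partition into singletons, which is the same partition used in the proof of the upper bound that follows.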

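We can easily find good upper bounds on $\eta_n$ and $\eta_n(k_n)$
in random settings as shown in the following theorem.

Theorem 2: Let $\{C_{i,j}\}_{1 \le i < j \le n}$ denote the undirected link
capacities. We then have the following upper bounds:

$$\eta_n \le \frac{1}{n-1} \sum_{1 \le i < j \le n} C_{i,j}, \qquad (7)$$

$$\eta_n(k_n) \le \frac{1}{k_n - 1} \left( \sum_{i < k_n,\, j \ge k_n} C_{i,j} + \sum_{1 \le i < j < k_n} C_{i,j} \right). \qquad (8)$$

As a consequence, with $\lim_{n \to \infty} k_n/n = \alpha$, the inequalities

$$\limsup_{n \to \infty} \frac{\eta_n}{n} \le \frac{1}{2} E[C], \qquad (9)$$

$$\limsup_{n \to \infty} \frac{\eta_n(k_n)}{n} \le \left(1 - \frac{\alpha}{2}\right) E[C] \qquad (10)$$

hold almost surely. □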
Proof: Consider the partition $p = \{\{1\}, \{2\}, \ldots, \{n\}\}$.
There are $n$ subsets in the partition, and $\partial p$ is the set of all
links. Apply now the definition (5) of $\eta_n$ and we immediately
get (7) as the upper bound for the allcast case.
For the multicast case, consider the partition

$$p = \{\{1\}, \{2\}, \ldots, \{k_n - 1\}, \{k_n, \ldots, n\}\}.$$

There are $k_n$ subsets in the partition. The set of links in $\partial p$
is

$$\{(i,j) : 1 \le i < k_n,\ j \ge k_n\} \cup \{(i,j) : 1 \le i < j < k_n\}.$$

Apply now the definition (6) of $\eta_n(k_n)$ and we immediately
get (8) as the upper bound for the multicast case.
Note that $|\partial p| = n(n-1)/2$ for allcast, and

$$|\partial p| = (k_n - 1)(n - k_n + 1) + \frac{(k_n - 1)(k_n - 2)}{2} = (k_n - 1)\left(n - \frac{k_n}{2}\right) \qquad (11)$$

for multicast.

Using $|\partial p| = n(n-1)/2$ for allcast in (7), we obtain

$$\frac{\eta_n}{n} \le \frac{1}{2} \cdot \frac{1}{|\partial p|} \sum_{e \in \partial p} C_e.$$

The sum on the right-hand side is composed of independent
and identically distributed random variables. Consequently, the
right-hand side converges almost surely to $\frac{1}{2} E[C]$ by the strong
law of large numbers, and we obtain (9).

For the multicast case, use (11) in (8) to obtain

$$\frac{\eta_n(k_n)}{n} \le \left(1 - \frac{k_n}{2n}\right) \cdot \frac{1}{|\partial p|} \sum_{e \in \partial p} C_e.$$

Again by an application of the strong law of large numbers,
the conclusion (10) follows. ■
Observe that, by Theorem 1, the upper bounds in Theorem 2
apply for capacity with the possibility of network coding. Let
us now turn to achievability of these rates in their respective
settings.
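
The two counting facts used in the proof above are easy to check numerically. The Python sketch below lists the intercomponent links of the multicast partition $\{\{1\}, \ldots, \{k-1\}, \{k, \ldots, n\}\}$ and evaluates the bound given by the partition into singletons. The function names and the choice of exponential capacities in the usage lines are our own illustrative assumptions.

```python
import random

def multicast_cut_links(n, k):
    """Links (i, j), 1-indexed with i < j, that cross the partition
    {{1}, ..., {k-1}, {k, ..., n}}: every link with smaller endpoint i < k."""
    return [(i, j) for i in range(1, k) for j in range(i + 1, n + 1)]

def trivial_partition_bound(n, sample, rng):
    """Total capacity of the complete graph divided by n - 1, i.e. the bound
    obtained from the partition into singletons, for iid draws sample(rng)."""
    total = sum(sample(rng) for _ in range(n * (n - 1) // 2))
    return total / (n - 1)

n, k = 10, 4
print(len(multicast_cut_links(n, k)), (k - 1) * (n - k / 2))

# With mean-one exponential capacities the normalized bound should be near 1/2.
rng = random.Random(3)
bound = trivial_partition_bound(400, lambda r: r.expovariate(1.0), rng)
print(bound / 400)
```

The first line of output compares a direct count of the crossing links with the closed form $(k-1)(n - k/2)$; the second is a single random draw and will fluctuate with the seed.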

IV. Allcast: Achievability 

In this section we consider the allcast setting and argue that
the upper bound in (9) is tight, and moreover, the upper bound
is achievable via flows. After first establishing the existence
of a scheme, we then provide a practical, decentralized,
asymptotically optimal push-pull algorithm.

Theorem 3: For the allcast problem, we have

$$\lim_{n \to \infty} \frac{\pi_n}{n} = \frac{1}{2} E[C] \quad \text{a.s.}$$

□

Proof: The fact that we cannot do better than $E[C]/2$
was already established in (9). So the proof of the above
theorem would be complete if we can establish that $E[C]/2$ is
achievable. We first argue achievability on the simpler
Erdős-Rényi graphs. We then lift this result to the general case.

Take the random graph $G(n,p)$ where each link capacity
is iid with Bernoulli($p$) distribution. Catlin et al. [15, Sec. 3]
proved the stronger result that, even if $p$ vanishes with $n$,
so long as it is larger than $(28 \log n / n)^{1/2}$, we have for all
sufficiently large $n$ the equality

$$\pi_n = \frac{\sum_{1 \le i < j \le n} C_{i,j}}{n - 1} \quad \text{a.s.} \qquad (12)$$

For any $\epsilon > 0$, using $p > 0$, the result in (12), and the strong
law of large numbers, we have

$$\liminf_{n \to \infty} \frac{\pi_n}{n} \ge \frac{p}{2}(1 - \epsilon) \quad \text{a.s.} \qquad (13)$$

By excluding all null sets associated with rational $\epsilon \in (0, 1)$,
it follows that

$$\liminf_{n \to \infty} \frac{\pi_n}{n} \ge \frac{p}{2} \quad \text{a.s.}$$



There now remains the step of lifting this result to any
generic distribution $F$, for the iid capacities $C_{i,j}$, satisfying

$$0 < E[C] = \int_0^\infty \Pr\{C > x\}\, dx = \int_0^\infty [1 - F(x)]\, dx < \infty. \qquad (14)$$

This is readily done. Fix an arbitrary $\epsilon > 0$. By (14) and the
fact that the function $1 - F(x)$ is Riemann integrable (for it
is Lebesgue integrable, bounded, and has at most a countable
number of discontinuities), we can choose a natural number
$M < \infty$ and $\delta > 0$ such that

$$\sum_{k=1}^{M} \delta \cdot [1 - F(k\delta)] \ge E[C] \cdot (1 - \epsilon). \qquad (15)$$



We now build a family of $M$ coupled graphs, each with $n$
vertices. For a realization of the iid link capacities, let $G_k$ be
a new graph on the $n$ vertices with a link between $i$ and $j$ if
and only if $C_{i,j} > k\delta$, for $k = 1, 2, \ldots, M$. Clearly, $G_k$ is an
Erdős-Rényi graph on $n$ vertices with parameter

$$p(k) = \Pr\{C > k\delta\} = 1 - F(k\delta).$$

On $G_k$, we interpret each link, if present, as having capacity
$\delta$. While the graphs are coupled across the parameter $k$, for
a fixed $k$, the links on the graph $G_k$ are iid Bernoulli($p(k)$)
random variables. Let $\pi_n(G_k)$ be the maximum number of
disjoint trees that can be packed in $G_k$. By the result (13)
applied to each fixed $k$, we have

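
$$\liminf_{n \to \infty} \frac{\pi_n}{n} \ \ge\ \liminf_{n \to \infty} \frac{1}{n} \sum_{k=1}^{M} \delta \cdot \pi_n(G_k) \ \ge\ \frac{1}{2} \sum_{k=1}^{M} \delta \cdot [1 - F(k\delta)] \cdot (1 - \epsilon) \ \ge\ \frac{1}{2} E[C] \cdot (1 - \epsilon) \cdot (1 - \epsilon) \quad \text{a.s.},$$

where the last inequality follows from (15). The first inequality
holds because a link of capacity $C_{i,j}$ is present in at most
$C_{i,j}/\delta$ of the graphs $G_k$, so that packing trees of weight $\delta$ in
each $G_k$ respects every capacity constraint. It follows
as before that $\liminf_{n \to \infty} \pi_n/n \ge E[C]/2$ almost surely. This completes
the proof. (See [16] or [17] for a similar truncation, quantization,
and scaling argument.) ■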
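
The quantization step in the proof can be illustrated numerically. The Python sketch below evaluates the lower Riemann sum $\sum_{k=1}^{M} \delta [1 - F(k\delta)]$ for a distribution whose mean is known. The uniform distribution on $[0, 1]$ and the values of $\delta$ and $M$ are our own choices for the example, not taken from the text.

```python
def quantized_mean(F, delta, M):
    """Lower Riemann sum  sum_{k=1}^{M} delta * (1 - F(k * delta)),
    which approximates E[C] = integral of 1 - F from below."""
    return sum(delta * (1.0 - F(k * delta)) for k in range(1, M + 1))

def uniform_cdf(x):
    """CDF of the Uniform[0, 1] distribution, for which E[C] = 1/2."""
    return min(max(x, 0.0), 1.0)

approx = quantized_mean(uniform_cdf, 0.01, 100)
print(approx)  # close to 0.495, slightly below E[C] = 0.5
```

Because $1 - F$ is nonincreasing, the sum never exceeds $E[C]$, and refining $\delta$ while increasing $M$ closes the gap, which is what the choice of $M$ and $\delta$ in the proof relies on.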
The key to proving Theorem 3 is the result (13) on
Erdős-Rényi graphs. In order to show this, we utilized the result (12)
of Catlin et al. [15]. The main point of the rest of this section
is to demonstrate that (13) can be proved constructively using
a rather simple and decentralized algorithm.

A. ALLCAST: A decentralized algorithm for allcast in a
random graph

In this section we describe a decentralized push-pull
algorithm for allcast that achieves (13) for an arbitrary $\epsilon > 0$.
For ease of exposition, we shall assume a total of $n + 1$ nodes
with node 0 as the source node. The source node has to
push a total of $\frac{1}{2} np(1 - \epsilon)$ bits to all nodes. We have ignored
integer rounding and a factor $(n + 1)/n$, both of which are
easily absorbed into $\epsilon$. The algorithm broadly has two push
steps and two pull steps, as described next. See Figure 1. The
analysis that comes later will argue that with overwhelming
probability none of the steps fail.

Figure 1. Graph showing the three sets of nodes: source, owners, and relays.
Source pushes bits to owners who then push to relays. All nodes then pull
from owners and any remaining bits from relays.

Algorithm ALLCAST: 

• Setting up of directions: All links that do not involve
the source node are assigned one of the two directions
with equal probability, independently of the choices of
directions at other links. All links that involve the source
node have a direction pointing away from the source.

• Push step 1: Source node pushes $\frac{1}{2} np(1 - \epsilon)$ different
bits to that many of its neighbors. We number the bits
$b_1, b_2, \ldots, b_{np(1-\epsilon)/2}$, call the respective recipient nodes
the owners of these bits, and denote the owners
(sometimes) as $O_1, O_2, \ldots, O_{np(1-\epsilon)/2}$ instead of saying node
1, node 2, ..., node $np(1 - \epsilon)/2$. There may be several
other neighbors of node 0, but the corresponding links are
left unused. These and other nodes that are not owners
are called relays, and are denoted $R_{np(1-\epsilon)/2+1}, \ldots, R_n$
(instead of saying node $np(1 - \epsilon)/2 + 1$, ..., node $n$).

• Push step 2: Each owner $O_i$ pushes its bit $b_i$ one more
level along links that point outward from $i$, regardless of
the status of the recipient as an owner of another bit or a
relay. The receiving node will then have $b_i$ (and similarly
many other bits) for other nodes to pull in the next couple
of steps of the algorithm.

• Pull step 1: Each node, say node $j$, collects all incoming
bits $b_i$ coming directly from owners $O_i$ via links $i \to j$.
(This is the bit pushed by $O_i$ in push step 2.)

• Pull step 2: Having collected some bits directly from
owners, node $j$ identifies the remaining bits, the relays
to which it is connected with direction pointing towards
$j$, and the bits that these relays have available having
received the bits directly from owners. A representation
of this information is the bit-map matrix of nodes and
bits they have available for pulling (see Table I and its
description). Node $j$ then identifies a complete matching
of these desired bits to the helper relays: each desired
yet-to-be-pulled bit is pulled from a suitable relay that
has the bit, with each relay accounting for one bit, and
this constitutes a matching. □
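
The steps above can be sketched as a small Python simulation. This is an illustrative sketch under our own conventions, not a reference implementation: nodes are indexed from 0, the source is left implicit (push step 1 is assumed to have succeeded), the first `m` nodes are the owners, and `X[i][j] = 1` means that the link between `i` and `j` is present and directed from `j` to `i`. The matching in pull step 2 is found with a standard augmenting-path search, which is one possible choice; the text does not prescribe a particular matching routine.

```python
import random

def random_bitmap(n, p, rng):
    """Each link is present with probability p; a fair coin picks its direction.
    X[i][j] = 1 iff the link {i, j} is present and directed j -> i."""
    X = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                if rng.random() < 0.5:
                    X[i][j] = 1   # j -> i
                else:
                    X[j][i] = 1   # i -> j
    return X

def can_pull_all(X, m, t):
    """True if node t can obtain all m bits: directly from owners (pull step 1)
    and, for the rest, via a matching of missing bits to helper relays
    (pull step 2), with each helper relay supplying at most one bit."""
    n = len(X)
    # Bits not received directly from their owners (an owner holds its own bit).
    missing = [b for b in range(m) if b != t and X[t][b] == 0]
    # Relays whose link points towards t.
    helpers = [r for r in range(m, n) if X[t][r] == 1]
    match = {}  # helper relay -> bit it supplies

    def augment(b, seen):
        for r in helpers:
            if X[r][b] == 1 and r not in seen:   # relay r holds bit b
                seen.add(r)
                if r not in match or augment(match[r], seen):
                    match[r] = b
                    return True
        return False

    return all(augment(b, set()) for b in missing)

def allcast_succeeds(X, m):
    """True if every node can collect all m bits."""
    return all(can_pull_all(X, m, t) for t in range(len(X)))

rng = random.Random(7)
n, p, eps = 60, 0.8, 0.5
m = int(n * p * (1 - eps) / 2)   # number of bits, and hence of owners
print(allcast_succeeds(random_bitmap(n, p, rng), m))
```

Each owner-to-node link carries one bit in push step 2 and each relay-to-node link carries at most one bit in pull step 2, which is why a matching, rather than an arbitrary assignment of bits to relays, is what the capacity constraints require.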



Before we dive into an analysis of this algorithm, we
describe the bit-map of Table I in more detail. The rows and
columns are indexed as

$$O_1, O_2, \ldots, O_{np(1-\epsilon)/2}, R_{np(1-\epsilon)/2+1}, \ldots, R_n.$$

In addition, the first $np(1 - \epsilon)/2$ columns will also refer to
the corresponding bits.

• For $1 \le i \le np(1 - \epsilon)/2$, we write $X_{i,i} = 1$ to signify
that node $O_i$ has bit $b_i$.

• For $i \ne j$, since the link $\{i,j\}$ itself occurs with
probability $p$, and further, may have either direction with
equal probability, we have

$X_{i,j} = 1, X_{j,i} = 0$ if $j \to i$;
$X_{j,i} = 1, X_{i,j} = 0$ if $i \to j$;
$X_{j,i} = 0, X_{i,j} = 0$ if no link between $i$ and $j$.

These are mutually exclusive, with the first setting
occurring with probability $p/2$, the second setting with
probability $p/2$, and the third setting with probability
$1 - p$.

• If $X_{i,j} = 1$, then node $i$ (owner or relay) can obtain bit
$b_j$ from owner $O_j$ (if $1 \le j \le np(1 - \epsilon)/2$) or some bit
that relay $R_j$ has (if $j > np(1 - \epsilon)/2$).

• The set of bits node $i$ receives directly from owners
corresponds to the set of 1s in the first $np(1 - \epsilon)/2$
columns of the $i$th row, for if $X_{i,j} = 1$, then owner
$O_j$ pushes its bit $b_j$ to node $i$. (For example, in Table I
owner $O_t$ has bits $b_1, b_2, b_t, b_{np(1-\epsilon)/2}$ but does not have
$b_a, b_b, b_c$.)

• The 1s in the $i$th row beyond column $np(1 - \epsilon)/2$ point
to relays that can be used by node $i$ to pull any remaining
bits in pull step 2. (For example, owner $O_t$ is connected to
relays $R_u, R_v, R_w$ with directions pointing towards $O_t$.
These relays will help node $O_t$ get the yet-to-be-pulled
bits $b_a, b_b, b_c$.)

• Clearly, while the random variables $X_{i,j}$ and $X_{j,i}$ are
coupled, the nondiagonal entries of the $i$th row

$$\{X_{i,j}, 1 \le j \le n, j \ne i\}$$

are iid Bernoulli($p/2$) random variables, for $1 \le i \le n$.
The same holds for nondiagonal entries of any column.
Our main assertion is that the algorithm ALLCAST succeeds
with high probability in distributing the $np(1 - \epsilon)/2$ bits to
all nodes.

Theorem 4: For any $\epsilon > 0$, the following event occurs
almost surely: for all but finitely many $n$, the algorithm
ALLCAST succeeds in distributing all $np(1 - \epsilon)/2$ bits to each
of the $n$ nodes. □

Table I. Bit-map of nodes and the bits they have available for pulling.

Remarks: 1) It follows immediately that, for any $\epsilon > 0$, the
inequality (13) holds.

2) The above theorem also implies that, for all sufficiently
large $n$, we can pack $np(1 - \epsilon)/2$ disjoint (spanning) trees in
$G(n,p)$, with each tree having the property that it has depth
at most 3.

3) ALLCAST is decentralized in the following sense. The
direction of each link, when present and if the source node
is not involved, is picked at random by the toss of a fair
coin, and this information is needed only at the two incident
nodes. The two levels of pushes, and thus the first pull stage,
are easily seen to be decentralized. At each node, the actions
depend only on the links incident on it and the agreed upon
link directions. Each node then keeps a list of bits it receives
from owners. For the final pull stage, each node has to get this
list associated with each of its potential helper relays. This is
the step that may involve significant exchange of information,
but the cost involved is a one-time set-up cost that can be
amortized over multiple rounds of data communication. Note
that all information exchanges (link directions, pushing of
owned bits, lists of bits available at neighboring helper relays)
involve information that is of local relevance and, in
addition, locally available. The matching can be identified in
0(n2) steps nil. 

4) We need three elementary tools to establish the result.
The first is the following well known concentration result for
the binomial distribution, which we state without proof.

Lemma 5: ([18, Th. 1.7(i)]) Suppose $0 < q \le \frac{1}{2}$,
$0 < \epsilon \le 1/12$, and $\epsilon n q (1 - q) \ge 12$. Let $S_{n,q}$ be the sum of
$n$ independent Bernoulli($q$) random variables. Then

$$\Pr\left\{ \left| \frac{S_{n,q}}{nq} - 1 \right| \ge \epsilon \right\} \le \frac{1}{\sqrt{\epsilon^2 n q}}\, e^{-\epsilon^2 n q / 3}. \qquad (16)$$

□

This result holds for every $n$ and $q$ satisfying $\epsilon n q (1 - q) \ge 12$,
and as such, $q$ can vary with $n$. The second tool is the
Borel-Cantelli lemma that gives us a sufficient condition for
almost sure convergence. The third tool is one of existence
of matchings on random bipartite graphs, which will be the
subject of Section V.
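
The binomial concentration result can be compared against the exact tail probability for moderate $n$. The Python sketch below assumes the bound takes the form $(\epsilon^2 n q)^{-1/2} e^{-\epsilon^2 n q/3}$, which is our reading of (16); the function names and the parameter values are illustrative choices only.

```python
import math

def binom_pmf(n, q, k):
    """Pr{S = k} for S ~ Binomial(n, q), computed in the log domain."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(q) + (n - k) * math.log(1.0 - q))
    return math.exp(log_pmf)

def exact_tail(n, q, eps):
    """Pr{ |S/(nq) - 1| >= eps } by direct summation of the pmf."""
    return sum(binom_pmf(n, q, k) for k in range(n + 1)
               if abs(k / (n * q) - 1.0) >= eps)

def tail_bound(n, q, eps):
    """Assumed form of the bound: exp(-eps^2 n q / 3) / sqrt(eps^2 n q)."""
    x = eps * eps * n * q
    return math.exp(-x / 3.0) / math.sqrt(x)

n, q, eps = 2000, 0.5, 1.0 / 12.0
assert eps * n * q * (1 - q) >= 12      # hypothesis of the lemma
print(exact_tail(n, q, eps), tail_bound(n, q, eps))
```

For fixed $q$ and $\epsilon$ the bound decays exponentially in $n$, which is the property used to obtain the estimates of the form $e^{-cn}$ in the proof of Theorem 4.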

Proof of Theorem 4: By the Borel-Cantelli lemma, it
suffices to show that the probability that the algorithm fails
for a particular $n$ is summable over $n$. If the algorithm fails,
then at least one of the following is true.

1) The event $A_1^{(n)}$ occurs, which is defined to be the event
that there are fewer than $\frac{1}{2} np(1 - \epsilon)$ vertices connected to
node 0. By Lemma 5, there is some $c_1 > 0$ such that for all
sufficiently large $n$, we have $\Pr\{A_1^{(n)}\} \le e^{-c_1 n}$.

2) For some node $t$, the event $A_2^{(n)}(t)$ occurs, which is
defined to be the event that the number of owners connected
to node $t$ with links pointing towards $t$ is outside the range
$\frac{1}{2} np(1 - \epsilon) \cdot \frac{1}{2} p(1 \pm \epsilon)$. (If node $t$ is an owner, there are
$\frac{1}{2} np(1 - \epsilon) - 1$ other owners, but the 1 can be absorbed into the
$(1 - \epsilon)$ factor.) Again by Lemma 5, there is some $c_2 > 0$ such
that for all sufficiently large $n$, we have $\Pr\{A_2^{(n)}(t)\} \le e^{-c_2 n}$.

3) For some node $t$, the event $A_3^{(n)}(t)$ occurs, which is the
event that the node $t$ is connected to fewer than

$$\beta_n := \frac{1}{2} np(1 - \epsilon) \left[ 1 - \frac{p}{2}(1 - \epsilon) \right]$$

relays with links pointing towards $t$. (Again, the case of 1 less
relay when node $t$ is a relay is easily handled.) Once again by
Lemma 5, there is a $c_3 > 0$ such that for all sufficiently large
$n$, we have $\Pr\{A_3^{(n)}(t)\} \le e^{-c_3 n}$.

4) For some node $t$, the event $A_1^{(n)} \cup A_2^{(n)}(t) \cup A_3^{(n)}(t)$ does not
occur, yet the event $M^{(n)}(t)$ occurs, which is the event that
node $t$ is unable to pull the desired bits. We claim that

$$\Pr\left\{ M^{(n)}(t) \,\middle|\, \left( A_1^{(n)} \cup A_2^{(n)}(t) \cup A_3^{(n)}(t) \right)^c \right\} \le \gamma(\beta_n) \qquad (17)$$

for some sequence $\gamma : \mathbb{N} \to [0, 1]$ satisfying

$$\sum_{n=1}^{\infty} n \gamma(\beta_n) < \infty. \qquad (18)$$

The event that the algorithm fails is then a subset of

$$A_1^{(n)} \cup \bigcup_{t=1}^{n} \left( A_2^{(n)}(t) \cup A_3^{(n)}(t) \cup M^{(n)}(t) \right)$$

whose probability is upper bounded via the union bound and
(17) by

$$n \cdot \left( e^{-n c_1} + e^{-n c_2} + e^{-n c_3} + \gamma(\beta_n) \right)$$

which, by the summability claim in (18) and the exponentially
decaying nature of the other terms, is summable.
Let us now prove (17) and (18).

Fix a node $t$, where $1 \le t \le n$. The event $A_1^{(n)}$ has not
occurred, and so the source has sent out exactly $\frac{1}{2} np(1 - \epsilon)$
bits to that many owners. The event $A_2^{(n)}(t)$ has not occurred,
and so the number of owners connected to node $t$ with links
towards node $t$ lies within $\frac{1}{2} np(1 - \epsilon) \cdot \frac{1}{2} p(1 \pm \epsilon)$.
The connected owners directly furnish their bits to node $t$.
But node $t$ needs at least
$\frac{1}{2} np(1 - \epsilon) - \frac{1}{2} np(1 - \epsilon) \cdot \frac{1}{2} p(1 + \epsilon)$
(and at most $\beta_n$) additional bits to be
pulled in pull step 2. This set of yet-to-be-pulled bits points
to some random selection of columns from amongst the first
$\frac{1}{2} np(1 - \epsilon)$ columns and does not include column $t$.

The event $A_3^{(n)}(t)$ has not occurred, and so node $t$ is
connected to at least $\beta_n$ relays that could potentially furnish
these missing bits (that is, with links towards node $t$). Consider
the rows corresponding to these relays. This set of rows is a
random selection of at least $\beta_n$ rows from amongst the indices
$\frac{1}{2} np(1 - \epsilon) + 1$ through $n$ and does not include $t$.

Observe that conditioned on these selections, the entries
of the submatrix continue to be iid Bernoulli($p/2$) random
variables. If $M^{(n)}(t)$ occurs, there is no coverage of the
yet-to-be-pulled bits (columns) using the helper relays (rows),
with each helper relay furnishing at most one missing bit. But
this in particular implies that there is no coverage of the
yet-to-be-pulled bits (columns) by some subset of exactly $\beta_n$ helper
relays (rows) with each helper relay furnishing at most one bit.
But this further implies that any superset of $\beta_n$ columns that
includes the yet-to-be-pulled bits (columns), and continues to
exclude column $t$, cannot be matched to the selected $\beta_n$ helper
relays (rows). Now, Lemma 9 of Section V shows that this
probability is upper bounded by $\gamma(\beta_n)$, which is (17), and
that $n \gamma(\beta_n)$ is summable, which is (18). This concludes the
proof. ■



The matching step above is the key to complete the deliv- 
eries. It ensures that all required bits are available at some 
helper relay, and that each link has at most 1 bit load so that 
capacity constraints are not violated. We now devote a section 
to demonstrating this key step. 

V. The existence of a bipartite matching 

In this section, we establish the crucial step of existence
of bipartite matchings. The following lemma, taken from
Bollobás [18], is key to showing that matchings exist almost
surely and one can pull the $\beta_n$ bits from relays. We first present
the result for a random bipartite graph with $n$ vertices on
each side. The results of this section are well-known and are
provided only for completeness and ease of reference.

Lemma 6: (fW. Lem. 7.12, p. 174]). Let G be a bipartite 
graph with vertex sets Vi , V2 such that \Vi\ — | V2 1 = n. 
Suppose G does not have any isolated vertices and it does 
not have a complete matching. Then there is a set A C Vi 
for either i = 1 or 2 such that the following three conditions 
hold: 

(i) r(^) has 1^1 - 1 elements, 

(ii) the subgraph spanned by A U T{A) is connected, 

(iii) 2 < < (n+ l)/2. □ 
The above conditions are simple consequences of Hall's
marriage theorem and some elementary observations. The
proof can be found in [18, Lem. 7.12, p. 174]. We now bound
the probability of these events on a random bipartite graph
$G(n,n,p)$ (see Section II-A).
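
To make conditions (i)-(iii) concrete, here is a minimal brute-force sketch that searches a small bipartite graph for a set A of the kind the lemma guarantees. It is exponential in n and is meant only as an illustration. The function names and the adjacency-list representation (`adj1`, `adj2`) are assumptions made for this example and do not come from [18].

```python
from itertools import combinations

def _connected(A, gamma, own, other):
    """Check that the subgraph spanned by A and its neighbour set is connected."""
    a_set = set(A)
    start = ("A", A[0])
    seen, stack = {start}, [start]
    while stack:
        kind, x = stack.pop()
        if kind == "A":
            nxt = [("G", y) for y in own[x] if y in gamma]
        else:
            nxt = [("A", y) for y in other[x] if y in a_set]
        for node in nxt:
            if node not in seen:
                seen.add(node)
                stack.append(node)
    return len(seen) == len(A) + len(gamma)

def find_deficient_set(adj1, adj2):
    """Brute-force search for a set A satisfying (i)-(iii).

    adj1[u] is the set of V2-neighbours of vertex u in V1, and adj2[v] is the
    set of V1-neighbours of vertex v in V2 (hypothetical representation).
    Returns (side, A, neighbours) or None if no such set exists.
    """
    n = len(adj1)
    for side, (own, other) in enumerate(((adj1, adj2), (adj2, adj1)), start=1):
        for a in range(2, (n + 1) // 2 + 1):           # condition (iii)
            for A in combinations(range(n), a):
                gamma = set().union(*(own[u] for u in A))
                if len(gamma) != a - 1:                # condition (i)
                    continue
                if _connected(A, gamma, own, other):   # condition (ii)
                    return side, A, gamma
    return None

# Vertices 0 and 1 of V1 both see only vertex 0 of V2, so no complete
# matching exists even though no vertex is isolated.
adj1 = [{0}, {0}, {1, 2}]
adj2 = [{0, 1}, {2}, {2}]
witness = find_deficient_set(adj1, adj2)
```

In this example the witness is the pair of V1 vertices that compete for the same V2 neighbour, which is the smallest obstruction allowed by condition (iii).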

Lemma 7: Let $F_a$ be the event that there is a set A of size a
with $A \subset V_i$ for i = 1 or 2 satisfying (i)-(iii) of Lemma 6. Let
$n_1 = (n+1)/2$. Consider $G(n,n,p)$. Then $\Pr\{\cup_{a=2}^{n_1} F_a\} \le \varepsilon_n$
where $\varepsilon_n$ is summable, and hence $\varepsilon_n \to 0$. Furthermore, we also
have $\sum_{n \ge 1} n\varepsilon_n < \infty$. □
Proof: Fix a. There are two choices for i in the condition
$A \subset V_i$, there are $\binom{n}{a}$ ways to choose the subset A, and there
are $\binom{n}{a-1}$ ways to choose the subset $\Gamma(A)$. Once chosen, there
must be no links between the a vertices of A and the n - a + 1
vertices of $V_2 - \Gamma(A)$. By the union bound (for the possibilities
for A and $\Gamma(A)$), we get

$$\Pr\{F_a\} \le 2\binom{n}{a}\binom{n}{a-1}(1-p)^{a(n-a+1)}. \qquad (19)$$

Using $\binom{n}{a} \le n^a$, by a second application of the union bound,
and by dropping some factors that are smaller than 1, we get

$$\Pr\{\cup_{a=2}^{n_1} F_a\} \le 2\sum_{a=2}^{n_1} n^{2a-1}(1-p)^{an}(1-p)^{-a^2} =: \varepsilon_n. \qquad (20)$$

For an $a_0$, set $n_0 = 2a_0 - 1$. It suffices to show that
for $n_0$ large, $\sum_{n \ge n_0} \varepsilon_n < \infty$. Interchanging the indices of
summation, and changing limits appropriately, we get

$$\sum_{n \ge n_0} \varepsilon_n = 2\sum_{a=2}^{a_0} (1-p)^{-a^2} \sum_{n \ge n_0} n^{2a-1}(1-p)^{an}
 + 2\sum_{a > a_0} (1-p)^{-a^2} \sum_{n \ge 2a-1} n^{2a-1}(1-p)^{an}. \qquad (21)$$

The first term is easily seen to be summable for any finite
$a_0$. For the second one, observe that for any $\delta > 0$ and any
$C > 0$, there is an $a_0$ large enough so that for all $a \ge a_0$
and all $n \ge 2a - 1$, we have $n^{2a-1} < n^{2a} < C(1+\delta)^{an}$. By
taking $C = (1-\delta)(1-(1-p)(1-\delta))$ it follows that



n>2a-l 



Choose $\delta$ small enough so that $(1-p)(1+\delta)^2 < 1$. Substitute
this in the second term in (21), and we see that it is summable.
Finally, to show that $\sum_{n \ge 1} n\varepsilon_n < \infty$, modify (21) as




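$$\sum_{n \ge n_0} n\varepsilon_n = 2\sum_{a=2}^{a_0} (1-p)^{-a^2} \sum_{n \ge n_0} n^{2a}(1-p)^{an}
 + 2\sum_{a > a_0} (1-p)^{-a^2} \sum_{n \ge 2a-1} n^{2a}(1-p)^{an}.$$

Figure 2. Single source single sink setting indicating how matching arises.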



By our choice of $a_0$ and $\delta$, we also have $n^{2a} < C(1+\delta)^{an}$,
and so all the steps that followed (21) apply, which establishes
summability of $n\varepsilon_n$. ■

We now put these together to argue that a bipartite matching
exists in $G(n,n,p)$ with high probability.

Theorem 8: The probability that $G(n,n,p)$ does not have a
complete matching is upper bounded by $\gamma(n) := 2n(1-p)^n + \varepsilon_n$,
where $\varepsilon_n$, defined in (20), has all the properties indicated
in Lemma 7. □
Proof: If $G(n,n,p)$ does not have a complete matching,
then either (1) there is an isolated vertex, or (2) there is no
isolated vertex and by virtue of Lemma 6, $\cup_{a=2}^{n_1} F_a$ must occur,
where $n_1 = (n+1)/2$ as before. By Lemma 7 the probability
of the second case event is at most $\varepsilon_n$. The probability that
there is an isolated vertex is, by the union bound, at most
$2n(1-p)^n$. ■
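
As a sanity check on Theorem 8, the following sketch estimates by simulation how often a random bipartite graph with n vertices on each side and link probability p lacks a complete matching, and evaluates the isolated-vertex term $2n(1-p)^n$ of $\gamma(n)$ for comparison. The matching routine is the standard augmenting-path (Kuhn) algorithm, chosen here for brevity; all function names are assumptions made for this illustration, and $\varepsilon_n$ is not computed.

```python
import random

def has_complete_matching(adj, n):
    """Augmenting-path matching; adj[u] lists right-neighbours of left vertex u."""
    match_right = [-1] * n          # match_right[v] = left vertex matched to v

    def try_augment(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    # A complete matching exists iff every left vertex can be augmented in turn.
    return all(try_augment(u, set()) for u in range(n))

def sample_bipartite(n, p, rng):
    """Draw each of the n*n possible links independently with probability p."""
    return [[v for v in range(n) if rng.random() < p] for _ in range(n)]

def isolated_vertex_bound(n, p):
    """Union bound on the probability that some vertex is isolated."""
    return 2 * n * (1 - p) ** n

def estimate_failure(n, p, trials, seed=0):
    """Fraction of sampled graphs that have no complete matching."""
    rng = random.Random(seed)
    fails = sum(not has_complete_matching(sample_bipartite(n, p, rng), n)
                for _ in range(trials))
    return fails / trials
```

For moderate n one would expect `estimate_failure(n, p, trials)` to be dominated by graphs with an isolated vertex, which is why the term $2n(1-p)^n$ is the natural quantity to compare against.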

In the previous section, we had a need to study existence of
bipartite matchings over left and right sets of size $\beta_n := \lfloor cn \rfloor$
where $0 < c < 1$.

Lemma 9: For a fixed $0 < c < 1$, let $\beta_n := \lfloor cn \rfloor$.
The probability that $G(\beta_n, \beta_n, p)$ does not have a complete
matching is upper bounded by $\gamma(\beta_n)$ where $\gamma$ is the upper
bounding function defined in Theorem 8. Furthermore,

$$\sum_{n \ge 1} n\gamma(\beta_n) < \infty.$$ □

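Proof: The upper bound on the probability that a matching
does not exist is immediate. We now show that $\sum_{n \ge 1} n\gamma(\beta_n)$
converges. Note that any particular integer repeats at most
$1/c + 1$ times in the sequence $\{\beta_n, n \ge 1\}$. As a consequence

$$\sum_{n \ge 1} n\gamma(\beta_n) \le \frac{1}{c}\sum_{n \ge 1} cn\,\gamma(\beta_n)
 \le \frac{1}{c}\sum_{n \ge 1} (\beta_n + 1)\,\gamma(\beta_n)
 \le \frac{1}{c}\left(\frac{1}{c} + 1\right)\sum_{k \ge 1} (k+1)\,\gamma(k) < \infty.$$ ■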
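
The counting step in the proof above can be checked numerically. The sketch below, assuming $\beta_n = \lfloor cn \rfloor$, counts how often each integer occurs in the sequence and compares the maximum with $1/c + 1$. The function name is an assumption made for this illustration.

```python
import math
from collections import Counter

def max_repetitions(c, n_max):
    """Largest number of indices n in 1..n_max sharing the same floor(c*n)."""
    counts = Counter(math.floor(c * n) for n in range(1, n_max + 1))
    return max(counts.values())

# With c = 0.3 the bound 1/c + 1 is about 4.33; no integer may repeat more often.
c = 0.3
within_bound = max_repetitions(c, 1000) <= 1 / c + 1
```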



VI. A Digression of Not Just Interpretive Value: 
Maximum Single Commodity Flow 

Let us now take a step back to see how matching arises
naturally in the simpler case of a single commodity flow
between a source node s and a sink node t. We shall assume
that additional nodes 1, 2, . . . , n are merely relays. The random
graph of interest is now $G(n+2, p)$, where the number n + 2
comes from n relay nodes and the two source and sink nodes.
Our interest is in the maximum rate of information flow
between source and sink, $\pi_n(2)$. (To conform strictly
to our earlier notation, we must use $\pi_{n+2}(2)$, for there are
n + 2 nodes in the network with the first two nodes being
in session. The asymptotics do not change, of course.)

Grimmett and Suen [19] showed that $\pi_n(2)$ grows linearly
in n and that $\lim_{n} \pi_n(2)/n = p$, almost surely. It is then clear
that the cut that isolates the source is a tight cut. So is the
cut that isolates the sink. Motivated by this, Karp et al. [12]
provided an algorithm that achieves the minimum cut capacity.
We will show that, for a fixed $\varepsilon > 0$, the following algorithm
transports $np(1-\varepsilon)$ bits from the source to the sink with
vanishing probability of failure. See Figure 2.

Algorithm MaxFlow:

• The source floods exactly $np(1-\varepsilon)$ links with one bit
per link.

• The sink pulls all these bits from $np(1-\varepsilon)$ links connected
to it in the following two steps.

(a) If any node connected to the sink is directly connected
to the source, the sink draws the corresponding bit. With
overwhelming probability, there are at least
$np(1-\varepsilon) \cdot p(1-\varepsilon)$ such connections.

(b) Here is how the sink draws the remaining bits. There
are at most $\beta_n = np(1-\varepsilon)(1 - p(1-\varepsilon))$ such yet-to-be-pulled
bits, and these reside with, let us say, source side
relays not in direct contact with the sink. Among those
relays that did not get a bit directly from the source (and
these are $n - np(1-\varepsilon) = n(1 - p(1-\varepsilon))$ in number)
the sink is connected to at least
$n(1 - p(1-\varepsilon)) \cdot p(1-\varepsilon) = \beta_n$, again with overwhelming probability. Let us
call these the sink side relays. There is a matching, again
with overwhelming probability, between the source side
relays and the sink side relays. This matching is then used
in the obvious way to draw the yet-to-be-pulled bits. □
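
The counts used in steps (a) and (b) can be worked through for concrete parameters. The sketch below simply evaluates the expressions stated in the algorithm; the function name and the example values n = 1000, p = 0.5, eps = 0.2 are assumptions chosen for illustration.

```python
def max_flow_counts(n, p, eps):
    """Evaluate the nominal counts appearing in Algorithm MaxFlow."""
    q = p * (1 - eps)                 # the recurring factor p(1 - eps)
    bits = n * q                      # bits flooded by the source
    direct = bits * q                 # step (a): bits drawn directly
    beta = bits * (1 - q)             # step (b): yet-to-be-pulled bits
    spare_relays = n * (1 - q)        # relays that received no bit from the source
    sink_side = spare_relays * q      # sink side relays available for matching
    return {"bits": bits, "direct": direct, "beta": beta,
            "spare_relays": spare_relays, "sink_side": sink_side}

counts = max_flow_counts(1000, 0.5, 0.2)
```

With these values the source floods 400 bits, about 160 are drawn directly, and about 240 remain to be pulled, which equals the number of sink side relays. This equality is what makes a square matching problem of size $\beta_n$ the natural object to study.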

Obviously, the direct link between s and t is inconsequential
for the asymptotics. It is further obvious from the analysis
of the previous section that the probability of failure is
overwhelmingly small, and moreover, it is summable over n
(Lemma 9). This is essentially the argument of Karp et al. [12]
to show the achievability direction of the result of Grimmett
and Suen [19].

What if we have not one sink t, but two sinks $t_1$ and $t_2$?
There is one matching needed for $t_1$ and another needed for $t_2$.
These matchings depend on the connections at the respective 
sinks, but can be found with overwhelmingly small probability 
of failure via the union bound for probabilities. Once these 
are found, while the relays may be overworked, the links are 
utilized within their capacity limits. Indeed, if a common sink- 
side relay is required to deliver the same bit (from a particular 
source side relay) to both sinks, then the relay simply copies 
the obtained bit on both links to the sinks. If the relay is 
required to supply two different bits to the two sinks, the 
matchings are to different bits, the relay fetches the two bits 
from the respective source side relays on two different links 
(as per matching), and supplies them to the two sinks via two 
different links. This matching on an as-needed basis minimizes 
link usage. But every time a new sink is added, new flows 
should be initiated to make all bits available to the new sink. 
Can we prepare the network to be in a state of readiness so 
that upon addition of a new sink, it is merely the new sink 
that does the necessary work to obtain all bits? 

Our next goal is to modify Algorithm MaxFlow into one that 
pushes two steps and then pulls, as in Algorithm ALLCAST, 
yielding a decentralized algorithm that easily extends to the 
case of multiple sinks. 

Consider the single source single sink case again, and the 
following algorithm. 

Algorithm MaxFlowPUSHPULL:

• Push step 1: The source node s floods $np(1-\varepsilon)$
links with one bit per link. We shall call the bits
$b_1, b_2, \ldots, b_{np(1-\varepsilon)}$ and the recipient nodes of these bits
the owners $O_1, O_2, \ldots, O_{np(1-\varepsilon)}$ of the respective
bits. All other nodes are termed relays and indexed
$R_{np(1-\varepsilon)+1}, \ldots, R_n$.

• Push step 2: Each owner $O_i$ pushes its bit $b_i$ one more
level, but only to neighbors who are not owners, and to
the sink t if there is a link to the sink. Owner-owner links
are unutilized.

• Pull step 1: The sink t collects all bits sent directly by
owners.

• Pull step 2: The sink t identifies the list of additional bits
needed, the list of relays it is connected to, the list of bits
they have in their possession, and does an appropriate
matching of relays with the required bits. It then pulls
the desired bits from these relays via the by now
all-too-familiar matching. □
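
The four steps above can be sketched on an explicit graph as follows. This is an illustration under stated assumptions rather than the authors' implementation: the graph is given as a dictionary `adj` of neighbour sets with unit link capacities, owners are taken to be the first m neighbours of the source in sorted order, a direct source-sink link is ignored, and the "appropriate matching" of pull step 2 is computed with the standard augmenting-path (Kuhn) algorithm. All function names are hypothetical.

```python
def max_matching(needs, holders):
    """Match each needed bit to a distinct relay holding it (Kuhn's algorithm).

    needs: list of bit ids; holders: dict relay -> set of bit ids it holds.
    Returns a dict bit -> relay covering as many bits as possible.
    """
    relay_of, bit_of = {}, {}

    def augment(bit, visited):
        for relay, bits in holders.items():
            if bit in bits and relay not in visited:
                visited.add(relay)
                if relay not in bit_of or augment(bit_of[relay], visited):
                    bit_of[relay] = bit
                    relay_of[bit] = relay
                    return True
        return False

    for bit in needs:
        augment(bit, set())
    return relay_of

def max_flow_push_pull(adj, s, t, m):
    """Return True if sink t can obtain all m bits pushed by source s."""
    candidates = sorted(adj[s] - {t})
    if len(candidates) < m:
        return False                       # source has too few links
    owners = candidates[:m]                # push step 1: bit i goes to owners[i]
    owner_set = set(owners)
    holders = {}                           # push step 2: owners push to non-owners
    for bit, owner in enumerate(owners):
        for relay in adj[owner] - owner_set - {s, t}:
            holders.setdefault(relay, set()).add(bit)
    # Pull step 1: bits whose owner is linked directly to the sink.
    got = {bit for bit, owner in enumerate(owners) if t in adj[owner]}
    # Pull step 2: one bit per relay-to-sink link, chosen by a matching.
    needs = [bit for bit in range(m) if bit not in got]
    usable = {r: bits for r, bits in holders.items() if t in adj[r]}
    return len(max_matching(needs, usable)) == len(needs)

# Source 0, sink 5, relays 1..4. Bit 0 reaches the sink via owner 1 directly;
# bit 1 sits at owner 2 and must be pulled through relay 3.
adj_ok = {0: {1, 2}, 5: {1, 3}, 1: {0, 5}, 2: {0, 3}, 3: {5, 2}, 4: set()}
adj_bad = {0: {1, 2}, 5: {1, 3}, 1: {0, 5}, 2: {0}, 3: {5}, 4: set()}
```

Because each relay is matched to at most one bit and each owner holds exactly one bit, every link carries at most one bit, which is the capacity constraint the matching is meant to enforce.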

The bit-map for this setting is much simpler (see Table II).
The columns are indexed by the bits. The rows are indexed by
the nodes, with the first $np(1-\varepsilon)$ representing the owners and
the rest representing the relays. Row i, when it corresponds
to owner $O_i$ (which is when $1 \le i \le np(1-\varepsilon)$), has a 1 only
on the ith column. But when row i corresponds to a relay
(which is when $i > np(1-\varepsilon)$), it has entry $X_{i,j} = 1$ if $O_j$ is
connected to $R_i$. Clearly, the presence or absence of this link
is independent of the status of all other links, and $X_{i,j}$ is a
Bernoulli(p) random variable, when $i > np(1-\varepsilon) \ge j$.



Table II 

Bit-map for one source one sink flow

We then have the following result.

Theorem 10: For any $\varepsilon > 0$, the following event occurs
almost surely: for all but finitely many n, the algorithm
MaxFlowPUSHPULL succeeds in transporting all $np(1-\varepsilon)$
bits from the source s to the sink t. □
Proof: This is almost immediate. If the algorithm fails,
one of the following must happen.

(1) The event $A_1^{(n)}$ occurs, which is the event that node s
is connected to fewer than $np(1-\varepsilon)$ relays. By Lemma 5 there
is a $c_1 > 0$ such that for all sufficiently large n, we have
$\Pr\{A_1^{(n)}\} \le e^{-nc_1}$.

(2) The event $A_2^{(n)}$ occurs, which is the event that the sink t
is connected to a number of owners outside the range
$np(1-\varepsilon) \cdot p(1 \pm \varepsilon)$. Again by Lemma 5 there is a $c_2 > 0$ such that
for all sufficiently large n, we have $\Pr\{A_2^{(n)}\} \le e^{-nc_2}$.

(3) The event $A_3^{(n)}$ occurs, which is the event that the sink
t is connected to fewer than $\beta_n := n(1 - p(1-\varepsilon)) \cdot p(1-\varepsilon)$
relays. Again by Lemma 5 there is a $c_3 > 0$ such that for all
sufficiently large n, we have $\Pr\{A_3^{(n)}\} \le e^{-nc_3}$.

(4) If $A_1^{(n)} \cup A_2^{(n)} \cup A_3^{(n)}$ does not occur, the number of bits
that remain to be pulled is at least $np(1-\varepsilon) - np(1-\varepsilon) \cdot p(1+\varepsilon)$
and at most $np(1-\varepsilon) - np(1-\varepsilon) \cdot p(1-\varepsilon) = \beta_n$. The number of
relays that can help the sink
pull these bits is at least $\beta_n$. For the algorithm to fail, the event
$M^{(n)}$ that there is no coverage of the yet-to-be-pulled bits by
the available relays with each relay accounting for at most one
bit (capacity constraint), must then occur. This implies that if
a particular set of $\beta_n$ relays is chosen, there is no coverage of
the required bits. This further implies that any superset of $\beta_n$
bits that includes the yet-to-be-pulled bits cannot be covered
by the $\beta_n$ chosen and available relays.

The submatrix formed by the $\beta_n$ chosen relays
(rows) and the $\beta_n$ chosen bits (columns) is a $\beta_n \times \beta_n$ square
submatrix whose entries are conditionally iid Bernoulli(p)
random variables. Again, we may view this as a bipartite graph
with the chosen relays on the one side and chosen bit indices
on the other side. Thus, if $A_1^{(n)} \cup A_2^{(n)} \cup A_3^{(n)}$ does not occur, but
$M^{(n)}$ does, then there is no matching on the random bipartite
graph. Using Theorem 8, the probability that such a matching
does not exist, conditioned on $(A_1^{(n)} \cup A_2^{(n)} \cup A_3^{(n)})^c$, is upper
bounded by $\gamma(\beta_n)$.

Figure 3. The relay($k_n$, n) network. Source pushes bits to owners who
then push to relays (solid lines). The sinks pull the bits from either owners
or relays (dashed lines).

Thus, the event that the sink is unable to pull all the bits
implies the event

$$A_1^{(n)} \cup A_2^{(n)} \cup A_3^{(n)} \cup M^{(n)},$$

and its probability is upper bounded by

$$e^{-nc_1} + e^{-nc_2} + e^{-nc_3} + \gamma(\beta_n). \qquad (22)$$

This is summable by Lemma 9 and the rest follows. ■
Instead of one sink, suppose we have two sinks $t_1$ and $t_2$
that are not connected directly to each other or directly to
the source. The source has to transport all its $np(1-\varepsilon)$ bits
to each of the two sinks using only the n relay nodes. We
may continue to use MaxFlowPUSHPULL with the following
extension. The two push steps are common. But each sink
simply executes its own pull steps based on the connections
it sees at its end and the information from its helper nodes.
Using the union bound, it immediately follows that Theorem
10 holds for one source and two sinks when there are no direct
connections between the set of nodes constituted by the source
and the sinks.

Indeed, we can say something much stronger. One version
that suffices to address the multicast setting of the next section
is the following. Consider a scenario where there is one
source s and a total of $k_n - 1$ sinks $t_1, t_2, \ldots, t_{k_n - 1}$ where
$\sup_{n \ge 1} k_n/n \le C$ for some $C < \infty$. The source and the sinks
have no links among themselves, but are connected through a
network of n relays. See Figure 3. The internal links between
the relays and the links between the source/sinks and the relays
are iid Bernoulli(p) random variables. The source wishes to
transfer all its bits of information to each of the sinks. Let us
denote this random network as relay($k_n$, n).

Theorem 11: For any $\varepsilon > 0$, the following event occurs
almost surely: for all but finitely many n, the algorithm
MaxFlowPUSHPULL, with the pull stages implemented by
each sink, succeeds in transporting all $np(1-\varepsilon)$ bits from
the source s to each of the $k_n - 1$ sinks on the relay($k_n$, n)
network. □



Proof: Observe that the first three terms in the upper
bound for the probability of failure in (22) decay exponentially
fast in n. The last term $\gamma(\beta_n)$ satisfies $\sum_{n \ge 1} n\gamma(\beta_n) < \infty$.
Since there are $k_n - 1 = O(n)$ sinks, by the union bound, the
probability that the algorithm fails for some sink is at most
$Cn(e^{-nc_1} + e^{-nc_2} + e^{-nc_3} + \gamma(\beta_n))$. This upper bound is
summable, and the rest follows. ■
A related model was considered by Ramamoorthy et al.
[20]. In their random network model, between each pair of
nodes, there are two links, one in each direction, with equal but
random capacity. The random variables were again iid. They
identified how the minimum cut capacity, which is also the
multicast capacity in directed settings, scales with the number
of relays. Our achievability result is, in contrast to that of
[20], constructive. Further, thanks to the undirected nature of
links in our model, our ability to choose directions flexibly
enables us to reach the network upper bound, asymptotically,
with flows.

VII. Multicast: Achievability 

We now return to the setting of n nodes of which $k_n$ are
in a multicast session. Node 1 is the source node and nodes
$2, 3, \ldots, k_n$ are the sinks. Our goal in this section is to show
that the upper bound (10) is achievable. While one could in
principle proceed as in Catlin et al. [15] to prove achievability,
we shall directly jump to a constructive proof.

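Theorem 12: For the multicast problem with $k_n$ nodes in
the session, let $\lim_{n \to \infty} k_n/n = \alpha \in [0, 1]$. We then have

$$\lim_{n \to \infty} \frac{\pi_n(k_n)}{n} = \left(1 - \frac{\alpha}{2}\right) E[C] \quad \text{a.s.}$$

□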

Proof: As in the proof of Theorem 3, the converse was
already shown in (10). So showing achievability suffices,
and further showing it on Erdos-Renyi random graphs with
parameter p suffices. Moreover, as before, it is enough to show
that: For any $\varepsilon > 0$, the following event occurs almost surely:
for all but finitely many n, there is an algorithm that succeeds
in transporting $\pi_n(k_n) \ge n(1 - \alpha/2)p(1 - 2\varepsilon)$ bits from the
source to each of the $k_n - 1$ sinks. We claim that this holds.
We first dispose of two easy cases.

When $\alpha = 0$, this follows from Theorem 11 by simply
ignoring the links between the session nodes and by using
MaxFlowPUSHPULL and $n(1-\varepsilon)$ relays, and with pulls
implemented at each of the sink nodes.

When $\alpha = 1$, pretend that all nodes are in session and
implement ALLCAST. The result follows from Theorem ID 

Only the case when $0 < \alpha < 1$ remains, for which we will
use a combination of the above.

Observe that the subset of session nodes alone forms a
complete graph with $k_n$ vertices for which Theorem 4 is
applicable. Using ALLCAST and without using any of the relay
nodes, we have that the source can distribute

$$\pi_n^{(1)} \ge \frac{k_n}{2}\, p(1-\varepsilon) \qquad (23)$$

bits to the other $k_n - 1$ nodes in the session, for all but finitely
many n, almost surely. (Summability of the probability upper
bound sequence holds since $k_n = \Omega(n)$.)

Removing these direct links between the session nodes, we
end up with the graph in Figure 3 where the session nodes are
now only connected to the $m_n = n - k_n$ relay nodes. The link
to each relay node from each session node has Bernoulli(p)
capacity. Further, the relay nodes have interrelay link capacities
that are independent Bernoulli(p) random variables. By Theorem
11, using MaxFlowPUSHPULL, the source can distribute

$$\pi_n^{(2)} \ge m_n\, p(1-\varepsilon) \qquad (24)$$

bits to the $k_n - 1$ sinks (solely with the help of the relay nodes),
for all but finitely many n, almost surely. (Summability of the
probability upper bound sequence holds since $m_n = \Omega(n)$.)

The result immediately follows from (23) and (24) since
$\pi_n(k_n) \ge \pi_n^{(1)} + \pi_n^{(2)}$ and $k_n/2 + m_n = n - k_n/2 \ge n(1 - \alpha/2)(1 - \varepsilon)$
for all sufficiently large n. ■
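
The way the two contributions combine can be checked with a few lines of arithmetic. The sketch below evaluates the session-only term $(k_n/2)\,p(1-\varepsilon)$ and the relay-assisted term $m_n\,p(1-\varepsilon)$ with $m_n = n - k_n$, and confirms that their sum is $(n - k_n/2)\,p(1-\varepsilon)$. The function name and the example values are assumptions made for illustration.

```python
def multicast_rate_lower_bound(n, k, p, eps):
    """Nominal rates from the session-only and relay-assisted stages."""
    m = n - k                                   # number of relay nodes
    session_part = (k / 2) * p * (1 - eps)      # ALLCAST among session nodes
    relay_part = m * p * (1 - eps)              # MaxFlowPUSHPULL via relays
    return session_part, relay_part, session_part + relay_part

# Example: n = 1000 nodes, k = 400 in session, p = 0.5, eps = 0.1.
session_part, relay_part, total = multicast_rate_lower_bound(1000, 400, 0.5, 0.1)
```

Here the session nodes contribute about 90 bits and the relays about 270, for a total of about 360, matching $(1000 - 200) \cdot 0.5 \cdot 0.9$.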



VIII. Vanishing Link Probabilities

Our results extend to the case when p is a function of n,
denoted $p_n$, and vanishes but sufficiently slowly. We shall
focus only on the allcast problem. The results for multicast
can be obtained in an analogous fashion.

Theorem 13: Let $p_n = \sqrt{(\tau_n \log n)/n}$ where $\tau_n \to \infty$ but
$p_n \to 0$. For any $\varepsilon > 0$, the following event occurs almost
surely: for all but finitely many n, the algorithm ALLCAST
succeeds in distributing $\frac{1}{2} n p_n (1-\varepsilon)$ bits to each of the n
nodes. Furthermore, $\lim_{n \to \infty} \pi_n(n)/(n p_n) = 1/2$ almost surely. □
Proof: The proof of the first part is similar to the proof of
Theorem 4 with some additional effort to get better probability
upper bound estimates. Again, we argue that the probability
that algorithm ALLCAST fails is summable over n. If the
algorithm fails for a particular n, at least one of the following
events must have occurred.

1) The event $A_1^{(n)}$ occurs, which is defined to be the event
that there are fewer than $\frac{1}{2} n p_n (1-\varepsilon)$ vertices connected to
node 0. By Lemma 5 applied with $q = p_n/2$, there is some
$c_1 > 0$ such that for all sufficiently large n, we have

$$\Pr\{A_1^{(n)}\} \le e^{-n \cdot \frac{1}{2} p_n \cdot \varepsilon^2/3} = e^{-c_1 \sqrt{n \tau_n \log n}}.$$

2) For some node t, the event $A_2^{(n)}(t)$ occurs, which is
defined to be the event that node t is connected to a
number of owners outside the range $\frac{1}{2} n p_n (1-\varepsilon) \cdot \frac{1}{2} p_n (1 \pm \varepsilon)$
with links pointing towards t. (The case when node t is an
owner leads to one fewer owner, which as before
is absorbed into the $(1 \pm \varepsilon)$ factor.) Again by Lemma 5 there is
some $c_2 > 0$ such that for all sufficiently large n, we have

$$\Pr\{A_2^{(n)}(t)\} \le e^{-\frac{1}{2} n p_n (1-\varepsilon) \cdot \frac{1}{2} p_n \cdot \varepsilon^2/3} = e^{-c_2 \tau_n \log n}. \qquad (25)$$

Note that $c_2$ can be arbitrarily small because of the factor $\varepsilon^2$.
Since we need n times $\Pr\{A_2^{(n)}(t)\}$ to go to zero, see (29)
which comes later, it is here where we utilize the assumption
that $\tau_n \to \infty$.

3) Let $A_1^{(n)}$ not occur. Then there are exactly $\frac{1}{2} n p_n (1-\varepsilon)$
owners. For some node t, the event $A_3^{(n)}(t)$ occurs, which is
the event that the node t is connected to fewer than

$$\left(n - \frac{1}{2} n p_n (1-\varepsilon)\right) \cdot \frac{1}{2} p_n (1-\varepsilon)
 = \frac{1}{2} n p_n (1-\varepsilon) \cdot \left(1 - \frac{1}{2} p_n (1-\varepsilon)\right) \qquad (26)$$

relays with links pointing towards t. (As before, the case of one
fewer relay when node t is a relay is easily handled.) Once again
by Lemma 5 there is a $c_3 > 0$ such that for all sufficiently
large n, we have

$$\Pr\{A_3^{(n)}(t)\} \le e^{-\left(n - \frac{1}{2} n p_n (1-\varepsilon)\right) \cdot \frac{1}{2} p_n \cdot \varepsilon^2/3} \le e^{-c_3 \sqrt{n \tau_n \log n}}.$$



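4) For some node t, the event $A_1^{(n)} \cup A_2^{(n)}(t) \cup A_3^{(n)}(t)$ does not
occur, but the event $M^{(n)}(t)$ occurs, which is the event that
node t is unable to pull the desired bits. We claim that

$$\Pr\left\{M^{(n)}(t) \mid \left(A_1^{(n)} \cup A_2^{(n)}(t) \cup A_3^{(n)}(t)\right)^c\right\} \le \delta_n \qquad (27)$$

where

$$\sum_{n=1}^{\infty} n\delta_n < \infty. \qquad (28)$$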

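The event that the algorithm fails is thus a subset of

$$A_1^{(n)} \cup \bigcup_{t=1}^{n} \left(A_2^{(n)}(t) \cup A_3^{(n)}(t) \cup M^{(n)}(t)\right)$$

whose probability is upper bounded via the union bound and
(27) by

$$n \cdot \left(e^{-c_1 \sqrt{n \tau_n \log n}} + e^{-c_2 \tau_n \log n} + e^{-c_3 \sqrt{n \tau_n \log n}} + \delta_n\right). \qquad (29)$$

By (28) and the assumption that $\tau_n \to \infty$, we see that this
bound is summable.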

What remains is to prove (27) and (28).

As before, the probability on the left-hand side of (27) is
upper bounded by the probability that there is no matching in
a bipartite graph with $\beta_n$ vertices on each side and link probability $p_n$.

We first sharpen Lemma 7. The bound in (19), after noting
that we now have $\beta_n$ vertices on one side, can be sharpened
(see [18, p. 174]) to

$$\Pr\{F_a\} \le 2\binom{\beta_n}{a}\binom{\beta_n}{a-1}(1-p_n)^{a(\beta_n - a + 1)}$$
$$\qquad\qquad \times \left(\binom{a(a-1)}{2a-2}\, p_n^{2a-2}\right),$$

where the extra term within parentheses in the second line can
be included because it is an upper bound (via the union bound)
on the probability that some 2a - 2 links, among the possible
a(a - 1) links from A to $\Gamma(A)$, are active. Recall that a is
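an integer satisfying $2 \le a \le (\beta_n + 1)/2$. Using the bounds
$\binom{n}{k} \le (en/k)^k$ and $1 - x \le e^{-x}$, we get

$$\Pr\{F_a\} \le 2\left(\frac{e\beta_n}{a}\right)^{a}\left(\frac{e\beta_n}{a-1}\right)^{a-1}\left(\frac{ea}{2}\right)^{2a-2} p_n^{2a-2}\, e^{-\beta_n p_n a\left(1 - \frac{a}{\beta_n} + \frac{1}{\beta_n}\right)}$$
$$\le C\, n p_n \left(\frac{e^2 n p_n^2}{4}\right)^{2a-2} e^{-a n p_n^2 (1-2\varepsilon)/4}$$

for some finite constant C, where in the last inequality we
have used $(1 + 1/k)^k \le e$, the bound $1 - (a-1)/\beta_n \ge 1/2$
when $2 \le a \le (\beta_n + 1)/2$, and the obvious upper and lower
bounds on $\beta_n$ from (26). Now, using $n p_n^2 = \tau_n \log n$, we get

$$\Pr\{F_a\} \le C \sqrt{n \tau_n \log n} \left(\frac{e^2 \tau_n \log n}{4}\right)^{2a-2} n^{-a\tau_n(1-2\varepsilon)/4}$$
$$= 16 C \left(\frac{\sqrt{n}}{e^4 (\tau_n \log n)^{3/2}}\right) \left(\frac{e^4 (\tau_n \log n)^2}{16\, n^{\tau_n(1-2\varepsilon)/4}}\right)^{a}.$$

Since the term inside the second parentheses converges to zero
as $n \to \infty$, it follows that for all sufficiently large n and some
finite constants $C_1$ and $C_2$, we have

$$\sum_{a=2}^{(\beta_n+1)/2} \Pr\{F_a\} \le C_1 \left(\frac{\sqrt{n}}{(\tau_n \log n)^{3/2}}\right) \left(\frac{(\tau_n \log n)^2}{16\, n^{\tau_n(1-2\varepsilon)/4}}\right)^{2}
 = C_2\, \frac{\sqrt{n}\,(\tau_n \log n)^{5/2}}{n^{\tau_n(1-2\varepsilon)/2}} =: \kappa_n.$$

The probability that there is no matching is then upper
bounded by $\delta_n := \kappa_n + 2\beta_n(1 - p_n)^{\beta_n}$. The second term
is upper bounded, using the bounds on $\beta_n$, as

$$2\beta_n (1 - p_n)^{\beta_n} \le n p_n\, e^{-\frac{1}{2} n p_n^2 (1-2\varepsilon)} = \frac{\sqrt{n \tau_n \log n}}{n^{\tau_n(1-2\varepsilon)/2}}.$$

From these two bounds, using $\tau_n \to \infty$, it is clear that not only
$\delta_n \to 0$, but in addition, $\sum_{n \ge 1} n\delta_n < \infty$. This establishes
(27) and (28) and proves validity of algorithm ALLCAST.
The above achievability result also establishes that

$$\liminf_{n \to \infty} \frac{\pi_n(n)}{n p_n} \ge \frac{1}{2}.$$

The upper bound

$$\limsup_{n \to \infty} \frac{\pi_n(n)}{n p_n} \le \frac{1}{2}$$

follows from (7) and Lemma 5. This concludes the proof of
the second statement. ■
The extension to multicasting can be done similarly.

IX. Discussion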



We began with the problem of characterizing the capacity
region for multiple allcast and multiple multicast sessions. Yet, we
largely focused on single allcast or single multicast with
just one sender and with the remaining nodes of the session as
receivers. But the study of single multicast suffices, thanks to the
result [1, Cor. 4.a] of Li et al. on transferability of rates across
sources (even with network coding). It is therefore clear how
the established results imply the validity of (1) and (2). The
requirement that the session nodes be identical for each of the
multiple multicasts is crucial for this transferability.

Moreover, we largely studied multicasting techniques that 
do not use network coding. One message coming out of 
this work is that though network coding provides a coding 
advantage in specific undirected scenarios, and one such 
example can be found in Li et al. [14|, in large dense random 
undirected networks of the variety studied in our paper the 
coding advantage is at most 1 + o(l) in the number of nodes. 
While our results applied to graphs $G(n, p_n)$ with $p_n \to 0$, we
did require that $p_n$ vanishes sufficiently slowly. In particular,
$p_n = \sqrt{(\tau_n \log n)/n}$ so that a typical node has degree
$n p_n = \sqrt{n \tau_n \log n}$. These are well connected, but by no means
sparse graphs. This naturally raises two questions. (1) Can one
extend these results to some useful classes of sparse random
graphs? (2) Can one find the rate at which the expected rates
for the proposed strategies converge to their asymptotic limits,
and show concentration around the expectations?
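
To get a feel for how dense these graphs are, the following sketch evaluates $p_n = \sqrt{(\tau_n \log n)/n}$ and the typical degree $n p_n$ for given values of n and $\tau_n$. The function names and the sample value $\tau_n = 2$ at $n = 10^6$ are assumptions chosen purely for illustration.

```python
import math

def link_probability(n, tau):
    """p_n = sqrt(tau * log(n) / n), using the natural logarithm."""
    return math.sqrt(tau * math.log(n) / n)

def typical_degree(n, tau):
    """Expected degree n * p_n = sqrt(n * tau * log(n))."""
    return n * link_probability(n, tau)

n, tau = 10**6, 2.0
p_n = link_probability(n, tau)      # small link probability
deg = typical_degree(n, tau)        # still thousands of neighbours per node
```

Even with a link probability below one percent, a typical node in this example has several thousand neighbours, which illustrates why these graphs are well connected and far from sparse.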

The result of asymptotically negligible network coding
advantage in single or multiple multicast settings (with identical
session nodes) may evoke the question of a possible connection
with a conjecture of Li and Li [21] for multiple unicasts.
Li and Li [21] conjectured that for multiple unicast, network
coding provides no coding advantage in undirected graphs.
While their conjecture holds true for some specific classes of
undirected graphs ([22], [23]), the general conjecture remains
unresolved. The negligible gain for multicasting in random
graphs studied here arises from the dense interconnectivity
between relays. The bottlenecks are primarily at the periphery.
So there does not seem to be much insight that one can glean
from our study to prove or disprove the Li and Li conjecture
for multiple unicasts in undirected networks.

While we studied multiple multicasts, our communication 
application naturally restricted us to a single set of session 
nodes. We thus had to study Steiner tree packings for a single 
subset of nodes. VLSI applications require efficient packing 
of Steiner trees across a multiplicity of such subsets (or nets;
see [8]). One could apply our random network framework to
such problems and attempt to devise similar quick-but-dirty 
algorithms. This is an interesting topic that is beyond the scope 
of this paper.